NSF PAR Search | NSF Public Access Repository

IP-FL: Incentive-driven Personalization in Federated Learning.

Khan, Ahmad_Faraz; Wang, Xinran; Le, Qi; Abdeen, Zain ul; Khan, Azal Ahmad; Ali, Haider; Jin, Ming; Ding, Jie; Butt, Ali R; Anwar, Ali (June 2025, Proceedings of the 39th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2025))

Free, publicly-accessible full text available June 3, 2026
CIWARS: A Web Server for Antibiotic Resistance Surveillance Using Longitudinal Metagenomic Data

https://doi.org/10.1016/j.jmb.2025.169159

Emon, Muhit Islam; Cheung, Yat Fei; Stoll, James; Rumi, Monjura Afrin; Brown, Connor; Choi, Joung Min; Moumi, Nazifa Ahmed; Ahmed, Shafayat; Song, Haoqiu; Sein, Justin; et al (August 2025, Journal of Molecular Biology)

Free, publicly-accessible full text available August 1, 2026
TreeCNN and NILMTK Unite: Illuminating Energy Efficiency in Real-World Scenarios

https://doi.org/10.1109/BigData62323.2024.10825584

Afroz, Sabiha; Ramanan, Buvana; Khan, Manzoor; Butt, Ali R (December 2024, IEEE)

Full Text Available
User-based I/O Profiling for Leadership Scale HPC Workloads

https://doi.org/10.1145/3700838.3700865

Yazdani, Ahmad Hossein; Paul, Arnab K; Karimi, Ahmad Maroof; Wang, Feiyi; Butt, Ali (January 2025, ACM)

Full Text Available
FedCaSe: Enhancing Federated Learning with Heterogeneity-aware Caching and Scheduling

https://doi.org/10.1145/3698038.3698559

Khan, Redwan_Ibne Seraj; Paul, Arnab K; Cheng, Yue; Jian, Xun; Butt, Ali R (November 2024, ACM)

Federated learning (FL) has emerged as a new paradigm of machine learning (ML) with the goal of collaborative learning on the vast pool of private data available across distributed edge devices. Most existing work on FL systems has focused on the computation and communication heterogeneity inherent in training with edge devices. However, the crucial impact of I/O and the role of limited on-device storage have not been fully explored in the FL context. Without policies that exploit on-device storage for the placement of client data samples and that schedule clients based on I/O benefits, FL training can suffer inefficiencies such as increased training time and degraded accuracy convergence. In this paper, we propose FedCaSe, a framework for efficiently caching client samples in situ on limited on-device storage and scheduling client participation. FedCaSe boosts I/O performance by exploiting a unique characteristic: the experience, i.e., the relative impact on overall performance, of data samples and clients. FedCaSe uses this information in adaptive caching policies for sample placement inside the limited memory of edge clients. The framework also exploits the experience information to orchestrate the future selection of clients. Our experiments with representative workloads and policies show that, compared to the state of the art, FedCaSe improves the training time by 2.06× for accuracy convergence at the scale of thousands of clients.
Full Text Available
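
The FedCaSe abstract above describes caching samples by their "experience" and reusing that signal for client selection. The following is a minimal sketch of that general idea only, not the authors' implementation: the class `ExperienceCache`, the function `select_clients`, and the numeric scores are all hypothetical, and the abstract does not specify how experience is actually computed or which eviction rule is used.

```python
from dataclasses import dataclass, field


@dataclass
class ExperienceCache:
    """Bounded cache that keeps the samples with the highest score.

    The score stands in for the "experience" of a sample; how it is
    computed is an assumption left outside this sketch.
    """
    capacity: int
    scores: dict = field(default_factory=dict)  # sample_id -> score

    def admit(self, sample_id, score):
        """Try to cache a sample; return True if it is cached afterwards."""
        if sample_id in self.scores:
            self.scores[sample_id] = score  # refresh the score in place
            return True
        if len(self.scores) < self.capacity:
            self.scores[sample_id] = score
            return True
        # Cache is full: evict the lowest-scored sample only if the
        # newcomer has a strictly higher score.
        victim = min(self.scores, key=self.scores.get)
        if self.scores[victim] >= score:
            return False
        del self.scores[victim]
        self.scores[sample_id] = score
        return True


def select_clients(client_scores, k):
    """Pick the k clients with the highest (hypothetical) experience score."""
    return sorted(client_scores, key=client_scores.get, reverse=True)[:k]


cache = ExperienceCache(capacity=2)
for sample_id, score in [("a", 0.1), ("b", 0.5), ("c", 0.3)]:
    cache.admit(sample_id, score)
chosen = select_clients({"x": 1.0, "y": 3.0, "z": 2.0}, k=2)
```

A score-based eviction rule is used here purely for illustration; an adaptive policy as described in the abstract would presumably update scores as training progresses.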
Memory Allocation Under Hardware Compression

https://doi.org/10.1109/MICRO61859.2024.00075

Laghari, Muhammad; Liu, Yuqing; Panwar, Gagandeep; Bears, David; Jearls, Chandler; Srinivas, Raghavendra; Choukse, Esha; Cameron, Kirk W; Butt, Ali R; Jian, Xun (November 2024, IEEE)

Full Text Available
Application-Attuned Memory Management for Containerized HPC Workflows

Arif, Moiz; Maurya, Avinash; Rafique, M. Mustafa; Nikolopoulos, Dimitrios S.; Butt, Ali R. (May 2024, IEEE International Parallel & Distributed Processing Symposium (IPDPS))

Full Text Available
Application-Attuned Memory Management for Containerized HPC Workflows

https://doi.org/10.1109/IPDPS57955.2024.00019

Arif, Moiz; Maurya, Avinash; Rafique, M Mustafa; Nikolopoulos, Dimitrios S; Butt, Ali R (May 2024, IEEE)

Full Text Available
Application-Attuned Memory Management for Containerized HPC Workflows

Arif, Moiz; Maurya, Avinash; Rafique, M. Mustafa; Nikolopoulos, Dimitrios S.; Butt, Ali R. (May 2024, Proceedings of the 38th IEEE International Parallel & Distributed Processing Symposium (IPDPS))

Full Text Available
Tarazu: An Adaptive End-to-end I/O Load-balancing Framework for Large-scale Parallel File Systems

https://doi.org/10.1145/3641885

Paul, Arnab K; Neuwirth, Sarah; Wadhwa, Bharti; Wang, Feiyi; Oral, Sarp; Butt, Ali R (May 2024, ACM Transactions on Storage)

The imbalanced I/O load on large parallel file systems affects the parallel I/O performance of high-performance computing (HPC) applications. One of the main reasons for I/O imbalances is the lack of a global view of system-wide resource consumption. While approaches to address the problem already exist, the diversity of HPC workloads combined with different file striping patterns prevents widespread adoption of these approaches. In addition, load-balancing techniques should be transparent to client applications. To address these issues, we propose Tarazu, an end-to-end control plane where clients transparently and adaptively write to a set of selected I/O servers to achieve balanced data placement. Our control plane leverages real-time load statistics for global data placement on distributed storage servers, while our design model employs trace-based optimization techniques to minimize latency for I/O load requests between clients and servers and to handle multiple striping patterns in files. We evaluate our proposed system on an experimental cluster for two common use cases: the synthetic I/O benchmark IOR and the scientific application I/O kernel HACC-I/O. We also use a discrete-time simulator with real HPC application traces from emerging workloads running on the Summit supercomputer to validate the effectiveness and scalability of Tarazu in large-scale storage environments. The results show improvements in load balancing and read performance of up to 33% and 43%, respectively, compared to the state of the art.
Full Text Available
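
The Tarazu abstract above describes choosing I/O servers from load statistics so that data placement stays balanced. As a rough illustration of load-aware placement in general, and not of Tarazu's actual algorithm, the sketch below greedily assigns each stripe to the currently least-loaded server. The function `place_stripes`, the server names, and the load units are hypothetical.

```python
import heapq


def place_stripes(server_load, stripe_sizes):
    """Greedily assign each stripe to the least-loaded server.

    server_load maps a server name to its current load; stripe_sizes
    lists the size of each stripe in the same (assumed) unit.
    Returns the chosen server for each stripe, in order.
    """
    # Min-heap keyed on load, with the server name as a tie-breaker.
    heap = [(load, name) for name, load in server_load.items()]
    heapq.heapify(heap)

    placement = []
    for size in stripe_sizes:
        load, name = heapq.heappop(heap)  # least-loaded server right now
        placement.append(name)
        heapq.heappush(heap, (load + size, name))  # account for the new stripe
    return placement


placement = place_stripes({"oss0": 10, "oss1": 0, "oss2": 5}, [4, 4, 4])
```

This greedy rule assumes the load snapshot is accurate and static during placement; a real control plane, as the abstract notes, would work from real-time statistics and also handle multiple striping patterns.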